The Universal Parser Compiler and Its Application to a Speech Translation System

Authors

  • Masaru Tomita
  • Marion Kee
  • Hiroaki Saito
  • Teruko Mitamura
  • Hideto Tomabechi
Abstract

We describe our Universal Parser Architecture and its use in a speech translation system, based on the Machine Translation system under development at the Center for Machine Translation at Carnegie Mellon. To "understand" natural language, a system must have syntactic knowledge of the language and semantic knowledge of the domain. The Universal Parser Architecture allows grammar writers to develop these kinds of knowledge separately in a declarative manner, and the compiler/interpreter integrates these two knowledge bases dynamically in order to parse input sentences. We recently integrated our system with a speech recognition system to accept spoken sentences (rather than typed sentences) by extending our runtime parser component to handle "noisy" phoneme sequences (of spoken utterances) that possibly include recognition errors. The nature of this modification is explained and examples are presented. We find our architecture very suitable for speech translation, as it combines the use of domain semantic knowledge with a highly efficient runtime parsing algorithm, thus accommodating the increased search space necessary for parsing speech input.

Published in the Proceedings of the Second International Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages, at Carnegie-Mellon University in Pittsburgh, Pennsylvania, June 1988.

Some portions of this paper are taken from the following papers:

  • "Linguistic and Domain Knowledge Sources for the Universal Parser Architecture" by Tomita, M., Kee, M., Mitamura, T. and Carbonell, J. G.; in Terminology and Knowledge Engineering, INDEKS Verlag, Frankfurt/M., 1987
  • "Parsing Noisy Sentences" by Saito, H. and Tomita, M.; in Proceedings of COLING88, Budapest, 1988

We should also like to thank other members of the Center for Machine Translation for useful comments and advice. Funding for this project is provided by several private institutions and governmental agencies in the United States and Japan.
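
The abstract mentions handling "noisy" phoneme sequences that may contain recognition errors. The sketch below is not the authors' method, which extends the runtime parser itself. It only illustrates the general idea of tolerating substituted, extra, or missing phonemes, using plain edit distance against a toy lexicon. The lexicon, the phoneme symbols, and the function names are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical toy lexicon: word -> phoneme sequence (illustrative symbols only).
LEXICON: Dict[str, List[str]] = {
    "head": ["h", "e", "d"],
    "hand": ["h", "a", "n", "d"],
    "hurts": ["h", "er", "t", "s"],
}


def edit_distance(observed: Sequence[str], expected: Sequence[str]) -> int:
    """Minimum number of substituted, extra, or missing phonemes."""
    # prev[j] = distance between the observed prefix so far and expected[:j]
    prev = list(range(len(expected) + 1))
    for i, obs in enumerate(observed, start=1):
        curr = [i]
        for j, exp in enumerate(expected, start=1):
            cost = 0 if obs == exp else 1
            curr.append(min(
                prev[j] + 1,         # extra phoneme in the input
                curr[j - 1] + 1,     # phoneme missing from the input
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def rank_words(observed: Sequence[str],
               lexicon: Dict[str, List[str]] = LEXICON) -> List[Tuple[int, str]]:
    """Rank lexicon words by how few phoneme errors explain the input."""
    return sorted((edit_distance(observed, phones), word)
                  for word, phones in lexicon.items())


if __name__ == "__main__":
    # "head" recognized with its final phoneme substituted.
    print(rank_words(["h", "e", "t"]))
```

Considering several candidate words per input, rather than one exact match, is what enlarges the search space that the abstract refers to. A real system would presumably weight errors by how confusable the phonemes are, which this sketch does not attempt.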

Similar resources

Towards a Speech-to-Speech Translation System (1)

We describe our Universal Parser Architecture and its use in a speech translation system, based on the Machine Translation system under development at the Center for Machine Translation at Carnegie Mellon. To "understand" natural language, a system must have syntactic knowledge of the language and semantic knowledge of the domain. The Universal Parser Architecture allows grammar writers to deve...

The Universal Parser Architecture for Knowledge-based Machine Translation

Machine translation should be semantically accurate, linguistically principled, user-interactive, and extensible to multiple languages and domains. This paper presents the universal parser architecture that strives to meet these objectives. In essence, linguistic knowledge bases (syntactic, semantic, lexical, pragmatic), encoded in theoretically-motivated formalisms such as lexical-functional gr...

Studying impressive parameters on the performance of Persian probabilistic context free grammar parser

In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...

Speech Understanding and Speech Translation in Various Domains by Maximum A-posteriori Semantic Decoding

This paper describes a domain-limited system for speech understanding as well as for speech translation. An integrated semantic decoder directly converts the preprocessed speech signal into its semantic representation by a maximum a-posteriori classification. With the combination of probabilistic knowledge on acoustic, phonetic, syntactic, and semantic levels, the semantic decoder extracts the ...

The Swedish Core Language Engine

The paper describes a Swedish-language customization (S-CLE) of the SRI Core Language Engine, which has been developed at SICS from the original English-language version by replacing English-specific modules with corresponding Swedish-language versions. The S-CLE is intended to be used as a building block in a broad range of applications, such as database query system, machine translation system...

Publication date: 2005